Plugins



Introduction 


• About the Presentation 

- audience 

• anyone with basic Nagios knowledge 

• anyone with basic scripting/coding knowledge 

- what a plugin is 

- how to write one 

- troubleshooting 

• About Me 

- work at NAS (NASA Advanced Supercomputing) 

- used Nagios for 5 years 

• started at Nagios 2.10 

• have written and maintain 25+ plugins 
NASA Advanced Supercomputing


• Pleiades 

- 11,312-node SGI ICE supercluster 

- 184,800 cores 

• Endeavour 

- 2-node SGI shared-memory system 

- 1,536 cores 

• Merope 

- 1,152-node SGI cluster 

- 13,824 cores 

• Hyperwall visualization cluster 

- 128-screen LCD wall arranged in 8x16 configuration 

- measures 23-ft. wide by 10-ft. high 

- 2,560 processor cores 

• Tape Storage - pDMF cluster 

- 4 front ends 

- 47 PB of unique file data stored 

Ref: http://www.nas.nasa.gov/hecc/ 
Nagios at NASA Advanced Supercomputing


• one main Nagios server 

• systems behind firewall send data by nrdp 

• some clusters behind firewall 

- one cluster uses nrpe for gathering data 

- other clusters use ssh 

• Post processor prepares visualization (HUD) data 

- separate daemon 

- Nagios APIs provide configuration and status data 

- provides file read by HUD 

- general architecture adaptable for other uses 
Plugins - Nagios extensions


Built-in plugins 

- Aren’t truly built-in, but they come standard when you install 
nagios-plugins 

• check_disk 

• check_ping 
Custom plugins 

- Let you test anything 

- The sky’s the limit - if you can code it, you can test it 
What are Plugins?


Nagios configuration to define a service that will use the plugin 
check_mydaemon.pl: 


define service { 
    host_name              linuxserver2 
    service_description    Check MyDaemon 
    check_command          check_mydaemon 
} 

define command { 
    command_name    check_mydaemon 
    command_line    check_mydaemon.pl -w 5 -c 10 
} 
Reasons to write your own plugin


• There isn’t a plugin out there that tests what you want 

• You need to test it differently 
Guidelines


• Any language you want 

• There is only one rule: it must exit with a return code that Nagios accepts 


    ok (green)          0 
    warning (yellow)    1 
    critical (red)      2 
    unknown (orange)    3 
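
The one rule above can be sketched in a few lines. This is a minimal illustration in Python rather than the Perl used later in these slides; the helper names `format_status` and `plugin_exit` are made up for this example and are not part of Nagios.

```python
import sys

# Nagios return codes, as listed in the table above.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def format_status(code, message):
    """Build the one-line status text that the plugin prints."""
    return f"{STATUS_NAMES[code]} - {message}"


def plugin_exit(code, message):
    """Print the status line and exit with the matching return code."""
    print(format_status(code, message))
    sys.exit(code)
```

A plugin would typically finish with a call such as `plugin_exit(CRITICAL, "mydaemon is not running")`.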
Plugin Pseudocode


• General outline of what a plugin needs to do 

- initialize object (if object oriented code) 

- read in the arguments 

- set variables 

- do the test 

- return results 

• This is just a suggestion 
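
The outline above might look roughly like this. It is a sketch in Python, not the Perl shown in the following slides; `do_test` is a placeholder that returns a fixed number, and the simple "higher is worse" comparison is an assumption made only for this example.

```python
import argparse


def read_arguments(argv):
    """Read in the arguments (-w and -c, as in: check_mydaemon.pl -w 5 -c 10)."""
    parser = argparse.ArgumentParser(description="hypothetical example check")
    parser.add_argument("-w", "--warning", type=float, required=True)
    parser.add_argument("-c", "--critical", type=float, required=True)
    return parser.parse_args(argv)


def do_test():
    """Do the test. Placeholder: a real plugin would measure something here."""
    return 7.0


def evaluate(value, warning, critical):
    """Turn the measurement into a Nagios return code (higher value is worse)."""
    if value > critical:
        return 2
    if value > warning:
        return 1
    return 0


def main(argv):
    """Run the steps in order and return the code the plugin should exit with."""
    args = read_arguments(argv)
    value = do_test()
    code = evaluate(value, args.warning, args.critical)
    label = ("OK", "WARNING", "CRITICAL")[code]
    print(f"{label} - measured value is {value}")
    return code

# A real plugin would end with: sys.exit(main(sys.argv[1:]))
```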


National Aeronautics and Space Administration 


For Perl: Nagios::Plugin 


use Nagios::Plugin; 

# Instantiate Nagios::Plugin object (the 'usage' parameter is mandatory) 

my $p = Nagios::Plugin->new( 
    usage   => "usage_string", 
    version => $version_number, 
    blurb   => 'brief info on plugin', 
    extra   => 'extended info on plugin', 
); 
For Perl: Nagios::Plugin (cont.)


# adding an argument ex: check_mydaemon.pl -w 

# define help string neatly - use below instead of qq 
# (double quotes, so that \n becomes a real newline) 

my $hlp_strg = "-w, --warning=INTEGER:INTEGER\n" . 
               "   If omitted, warning is generated."; 

$p->add_arg( 
    spec     => 'warning|w=s', 
    help     => $hlp_strg, 
    required => 1, 
    default  => 10, 
); 


# parse the command line, then access the argument 
$p->getopts; 
my $warning = $p->opts->warning; 
For Perl: Nagios::Plugin (cont.)


# finishing the script: 

$p->nagios_exit( 
    return_code => $p->check_threshold($result), 
    message     => "info on what $result means", 
); 


# if you are not using check_threshold, use text for the return code: 
# one of 'OK', 'WARNING', 'CRITICAL' or 'UNKNOWN', for example 
#   return_code => 'CRITICAL', 
For Perl: Nagios::Plugin (cont.)


• When you’ve done your code and have $result to compare 
to the thresholds: 

- $return_code = $p->check_threshold($result) 

- follows nagios convention of min:max 

• check_mydaemon.pl -w 5 will warn on anything > 5 (or < 0) 

• check_mydaemon.pl -w :5 will warn on anything > 5 (same as above) 

• check_mydaemon.pl -w 5: will warn on anything < 5 

• check_mydaemon.pl -w 5:7 will warn on anything < 5 
or > 7 

• if you overlap critical and warning, critical takes 
precedence 
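
To make the min:max convention concrete, here is a small sketch in Python (not the Perl module itself). It covers only the four forms listed above; the function names are invented for this illustration, and it is not a reimplementation of Nagios::Plugin's own range handling.

```python
import math


def parse_range(spec):
    """Return (low, high) for a threshold written as 'max', ':max', 'min:' or 'min:max'."""
    if ":" not in spec:
        return 0.0, float(spec)              # "5" means 0 to 5
    low_text, high_text = spec.split(":", 1)
    low = float(low_text) if low_text else 0.0
    high = float(high_text) if high_text else math.inf
    return low, high


def outside(value, spec):
    """True when the value falls outside the range, i.e. when it should alert."""
    low, high = parse_range(spec)
    return value < low or value > high


def check_threshold(value, warning, critical):
    """Return 2, 1 or 0. Critical is tested first, so it wins when ranges overlap."""
    if outside(value, critical):
        return 2
    if outside(value, warning):
        return 1
    return 0
```

For example, `check_threshold(7, "5", "10")` gives 1 (warning), and `check_threshold(12, "5", "10")` gives 2 (critical).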
Overcoming issues


• Test needs elevated privilege 

• Nagios can be run as root, but that is not secure 

- run the test as root via cronjob; write info to a flat file 

- use nagios plugin to read and process the file 

• Output of the test was too big 

- the resulting nrdp command hit a kernel limit 

- use ssh to get the output to the main nagios server 
ex: ssh blah blah 

- use plugin on the main server to process it 
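
For the elevated-privilege workaround above (root cron job writes a flat file, the plugin reads it), the reading side could be sketched as follows. This is Python rather than Perl, and the file layout (a single number in the file) and the staleness check are assumptions made for this example, not something the slides specify.

```python
import os
import time


def read_cron_result(path, max_age_seconds, now=None):
    """Return (return_code, message) based on a one-number file written by cron."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
        with open(path) as handle:
            text = handle.read().strip()
    except OSError as error:
        # Missing or unreadable file: the plugin cannot tell, so report UNKNOWN.
        return 3, f"cannot read {path}: {error.strerror}"
    if age > max_age_seconds:
        # The cron job has probably stopped running.
        return 3, f"{path} is stale ({int(age)}s old)"
    try:
        value = float(text)
    except ValueError:
        return 3, f"unexpected content in {path}"
    return 0, f"cron job reported {value}"
```

Checking the file's age matters here: without it, a cron job that silently stopped would leave the last good result in place and the service would stay green.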
Nagios perfdata

• Nagios is designed to allow plugins to return optional 
performance data in addition to normal status data 

- in nagios.cfg, enable the process_performance_data option 

- Nagios collects this information to be displayed on the GUI 

- appended to the status line in the format "| key1=value1 key2=value2 ... keyN=valueN" (pairs separated by spaces) 

- this can be anything that has a numerical value 
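
As an illustration of what a plugin prints when it returns perfdata, here is a short Python sketch. The metric names are made up, and only the basic label=value form is shown; the Nagios plugin guidelines also allow optional units and thresholds after each value.

```python
def format_perfdata(metrics):
    """Join label=value pairs with spaces, the separator Nagios expects."""
    return " ".join(f"{label}={value}" for label, value in metrics.items())


def status_line(status, message, metrics):
    """Status text, then a pipe, then the performance data."""
    return f"{status} - {message} | {format_perfdata(metrics)}"


line = status_line("OK", "mydaemon responding",
                   {"response_time": 0.25, "connections": 12})
print(line)
```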
Troubleshooting


• The Nagios display says: return code XXX is out of bounds 

- your script returned something other than 0, 1, 2, or 3 

- otherwise it is a Nagios error 

• Google is your friend 

- ex: 13 usually means a permission error 

- sometimes all it tells you is "something went wrong" 

- these disappeared at our site when we switched to 
Nagios::Plugin 

• try running the plugin from the command line 

- verify who you are running as 

- verify the arguments passed in 
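
When running a plugin by hand, it helps to look at the return code as well as the output. The following Python sketch wraps any plugin command and reports whether its return code is one Nagios accepts; `run_plugin` is a name invented for this example.

```python
import subprocess


def run_plugin(command):
    """Run a plugin (a list of strings) and report (return code, in bounds?, output)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    in_bounds = completed.returncode in (0, 1, 2, 3)
    return completed.returncode, in_bounds, completed.stdout.strip()
```

For instance, `run_plugin(["./check_mydaemon.pl", "-w", "5", "-c", "10"])` would show the same return code Nagios sees, provided it is run as the same user and with the same arguments that Nagios uses.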
Troubleshooting (cont.)


Timing is everything! 

- launching too many processes 

- files can get overwritten 

• by cron jobs 

• by multiple nagios processes 

If perfdata is enabled, the perfdata log is the most useful place to look 
Questions
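
One common way to reduce the overwritten-file problem described above is to write to a temporary file and rename it into place, so a reader never sees a half-written file. This is a general technique sketched in Python, not something the slides prescribe; `write_atomically` is a hypothetical helper name.

```python
import os
import tempfile


def write_atomically(path, text):
    """Write text to a temp file in the same directory, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.replace(temp_path, path)   # the rename is atomic on POSIX filesystems
    except BaseException:
        os.unlink(temp_path)          # do not leave the temp file behind
        raise
```

The temporary file is created in the target's directory because a rename is only atomic within a single filesystem.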


• Any questions? 